Letter Sequence Labeling for Compound Splitting
Abstract
For languages such as German, where compounds occur frequently and are written as single tokens, a wide variety of NLP applications benefit from recognizing and splitting compounds. As the traditional word frequency-based approach to compound splitting has several drawbacks, this paper introduces a letter sequence labeling approach, which can utilize rich word form features to build discriminative learning models that are optimized for splitting. Experiments show that the proposed method significantly outperforms state-of-the-art compound splitters.
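As a rough illustration of what "letter sequence labeling" can look like, the sketch below encodes a compound split as one label per letter and decodes it back. The two-label scheme (`B` for a letter that begins a component, `I` otherwise), the function names, and the feature set are assumptions made for illustration only; the abstract does not specify the paper's actual tag set, features, or learner.

```python
from typing import Dict, List


def encode(parts: List[str]) -> List[str]:
    """Turn a gold split into per-letter labels (assumed B/I scheme)."""
    labels: List[str] = []
    for part in parts:
        if not part:
            raise ValueError("components must be non-empty")
        # First letter of each component gets 'B', the rest get 'I'.
        labels.extend(["B"] + ["I"] * (len(part) - 1))
    return labels


def decode(word: str, labels: List[str]) -> List[str]:
    """Rebuild the components of `word` from per-letter labels."""
    if len(word) != len(labels):
        raise ValueError("need exactly one label per letter")
    parts: List[str] = []
    for ch, lab in zip(word, labels):
        if lab == "B" or not parts:
            parts.append(ch)   # start a new component
        else:
            parts[-1] += ch    # extend the current component
    return parts


def letter_features(word: str, i: int) -> Dict[str, str]:
    """Hypothetical word-form features for the letter at position i."""
    w = word.lower()
    return {
        "char": w[i],
        "prev": w[i - 1] if i > 0 else "<s>",
        "next": w[i + 1] if i + 1 < len(w) else "</s>",
        "left_bigram": w[max(0, i - 2):i],
        "right_bigram": w[i:i + 2],
    }


labels = encode(["Haus", "tür"])
parts = decode("Haustür", labels)
```

A discriminative sequence model (for example, a CRF) could then be trained to predict the label of each letter from such features, and `decode` would turn its predictions into a split.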
Similar resources
$Z_k$-Magic Labeling of Some Families of Graphs
For any non-trivial abelian group $A$ under addition, a graph $G$ is said to be $A$-\textit{magic} if there exists a labeling $f: E(G) \rightarrow A - \{0\}$ such that the vertex labeling $f^+$ defined as $f^+(v) = \sum f(uv)$, taken over all edges $uv$ incident at $v$, is a constant. An $A$-\textit{magic} graph $G$ is said to be a $Z_k$-magic graph if the group $A$ is $Z_k$, the group of integers modulo $k$...
Chinese Event Descriptive Clause Splitting with Structured SVMs
Chinese event descriptive clause splitting is the task of splitting a complex Chinese sentence into several clauses. In this paper, we present a discriminative approach to the Chinese event descriptive clause splitting task. By formulating Chinese clause splitting as a sequence labeling problem, we apply the structured SVMs model to it. Compared with other two baseli...
Predicting the scribe behind a page of medieval handwriting
This paper addresses the issue of attributing pieces of medieval handwriting to scribes known from other examples of writing. The system is applied to manuscript page images and performs extraction and comparison of letter shapes. Letters and sequences of connected letters are identified by means of connected component labeling. This is followed by further splitting into letter-size pieces. The...
Synthesis, Radioiodination and Biodistribution Evaluation of 5-(2-amimo-4-styryl pyrimidine-4-yl)-4-methoxybenzofuran-6-ol
This study describes the organic synthesis of 5-(2-amimo-4-styryl pyrimidine-4-yl)-4-methoxy benzofuran-6-ol (SPBF) as an example of a benzofuran derivative used in a new series of amyloid imaging agents. These benzofuran derivatives may be useful amyloid imaging agents for detecting β-amyloid plaques in the brains of patients with Alzheimer’s disease. The precursor is 1-[6-hydroxy-4-methoxybenzofuran-5-yl]-p...
Analyzing and Aligning German compound nouns
In this paper, we present and evaluate an approach for the compositional alignment of compound nouns using comparable corpora from technical domains. The task of term alignment consists in relating a source language term to its translation in a list of target language terms with the help of a bilingual dictionary. Compound splitting makes it possible to transform a compound into a sequence of components w...
Publication date: 2016